A link-bridged topic model for cross-domain document classification
Authors
Pei Yang (corresponding author), Wei Gao, Qi Tan, Kam-Fai Wong
Abstract
0306-4573/$ see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.ipm.2013.05.002
Corresponding author at: Department of Computer Science, South China University of Technology, Guangzhou, China. Tel.: +852 39438461; fax: 26035505.
Similar resources
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A large volume of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...
A Probabilistic Topic Model Based on Local Word Relations in Overlapping Windows
A probabilistic topic model assumes that documents are generated through a process involving topics and then tries to reverse this process: given the documents, it extracts the topics. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...
A Document Weighted Approach for Gender and Age Prediction Based on Term Weight Measure
Author profiling is a text classification technique used to predict the profiles of unknown texts by analyzing their writing styles. Author profiles are characteristics of the authors such as gender, age, native language, country and educational background. The existing approaches for author profiling suffer from problems such as high dimensionality of features and fail to capture th...
Bayesian Bridging Topic Models for Classification
We study the problem of constructing a topic-based model over different domains for text classification. In real-world applications, there are abundant unlabeled documents but sparse labeled documents. It is challenging to construct a reliable and adaptive model to classify a large number of documents from different domains. The classifiers trained from a source domain shall perform poo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
Journal title:
- Inf. Process. Manage.
Volume 49
Publication date 2013